A framework for incremental generation of closed itemsets

Authors

  • Petko Valtchev
  • Rokia Missaoui
  • Robert Godin
Abstract

Association rule mining from a transaction database (TDB) requires the detection of frequently occurring patterns, called frequent itemsets (FIs), whereby the number of FIs may be potentially huge. Recent approaches for FI mining use the closed itemset paradigm to limit the mining effort to a subset of the entire FI family, the frequent closed itemsets (FCIs). We show here how FCIs can be mined incrementally yet efficiently whenever a new transaction is added to a database whose mining results are available. Our approach for mining FIs in dynamic databases relies on recent results about lattice incremental restructuring and lattice construction. The fundamentals of the incremental FCI mining task are discussed and its reduction to the problem of lattice update, via the closed itemset (CI) family, is made explicit. The related structural results underlie two algorithms for updating the set of FCIs of a given TDB upon the insertion of a new transaction. A straightforward method searches for necessary completions throughout the entire CI family, whereas a second method exploits lattice properties to limit the search to CIs which share at least one item with the new transaction. An efficient implementation of the parsimonious method is discussed in the paper, together with a set of results from a preliminary study of the method's practical performance. © 2007 Elsevier B.V. All rights reserved.
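The contrast between the two update strategies can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it uses the generic intersection principle (every closed itemset of the updated database is either an old CI, the new transaction, or the intersection of the new transaction with an old CI) and ignores the empty itemset and any support threshold. The names `add_transaction`, `add_transaction_pruned`, `_update`, and the `index` structure are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def _update(closed, t, candidates):
    """Insert transaction t, looking only at the given candidate CIs.

    closed maps each non-empty closed itemset (frozenset) to its support count.
    Returns the list of closed itemsets created by this insertion.
    """
    # best[y] = largest old support among candidates c with c & t == y.
    # That maximum is the support of y in the old database (0 if y is unseen).
    best = {t: 0}
    for c in candidates:
        y = c & t
        if y and closed[c] > best.get(y, 0):
            best[y] = closed[c]
    created = [y for y in best if y not in closed]
    for y, old_support in best.items():
        closed[y] = old_support + 1  # the new transaction contains y
    return created


def add_transaction(closed, transaction):
    """Straightforward variant: scan the entire CI family."""
    t = frozenset(transaction)
    if t:
        _update(closed, t, list(closed))


def add_transaction_pruned(closed, index, transaction):
    """Restricted variant: visit only CIs sharing an item with the transaction.

    index (assumed helper) maps each item to the set of CIs containing it.
    """
    t = frozenset(transaction)
    if not t:
        return
    sharing = set().union(*(index[item] for item in t))
    for y in _update(closed, t, sharing):
        for item in y:
            index[item].add(y)  # keep the inverted index in sync


if __name__ == "__main__":
    closed, index = {}, defaultdict(set)
    for row in [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}]:
        add_transaction_pruned(closed, index, row)
    for itemset, support in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), support)
```

Both variants yield the same family, because a CI with no item in common with the new transaction contributes only the empty intersection. The restricted variant simply avoids visiting such CIs, at the cost of maintaining the inverted index.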

Similar resources

An Efficient Incremental Algorithm to Mine Closed Frequent Itemsets over Data Streams

The purpose of this work is to mine closed frequent itemsets from transactional data streams using a sliding window model. An efficient algorithm, IMCFI, is proposed for Incremental Mining of Closed Frequent Itemsets from a transactional data stream. The proposed algorithm IMCFI uses a data structure called INdexed Tree (INT), similar to the NewCET structure used in NewMoment [5]. INT contains an index table Item...

An On-Line Approximation Algorithm for Mining Frequent Closed Itemsets Based on Incremental Intersection

We propose a new on-line ε-approximation algorithm for mining closed itemsets from a transactional data stream, which is also based on the incremental/cumulative intersection principle. The proposed algorithm, called LC-CloStream, is constructed by integrating CloStream algorithm and Lossy Counting algorithm. We investigate some behaviors of the LC-CloStream algorithm. Firstly we show the incom...

Mining Non- Redundant Frequent Pattern in Taxonomy Datasets using Concept Lattices

In general, frequent itemsets are generated from large data sets by applying various association rule mining algorithms, which produce many redundant frequent itemsets. In this paper we propose a new framework for non-redundant frequent itemset generation on taxonomy datasets, using closed frequent itemsets and concept lattices without loss of information. General Terms Frequent Pattern, Assoc...

Efficient Incremental Mining of Top-K Frequent Closed Itemsets

In this work we study the mining of top-K frequent closed itemsets, a recently proposed variant of the classical problem of mining frequent closed itemsets where the support threshold is chosen as the maximum value sufficient to guarantee that the itemsets returned in output be at least K. We discuss the effectiveness of parameter K in controlling the output size and develop an efficient algori...

Mining Frequent Closed Itemsets with the Frequent Pattern List

The mining of the complete set of frequent itemsets will lead to a huge number of itemsets. Fortunately, this problem can be reduced to the mining of frequent closed itemsets (FCIs), which results in a much smaller number of itemsets. The approaches to mining frequent closed itemsets can be categorized into two groups: those with candidate generation and those without. In this paper, we propose...

Journal:
  • Discrete Applied Mathematics

Volume 156

Publication date: 2008